{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "# Python Async Inference Tutorial - Multiple Models with Model Scheduler\n",
    "\n",
    "This tutorial will describe how to run an inference process.\n",
    "\n",
    "\n",
    "**Requirements:**\n",
    "\n",
    "* Run the notebook inside the Python virtual environment: ```source hailo_virtualenv/bin/activate```\n",
    "\n",
    "It is recommended to use the command ``hailo tutorial`` (when inside the ```virtualenv```) to open a Jupyter server that contains the tutorials."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running Inference using HailoRT\n",
    "\n",
    "In this example we will use the Model Scheduler to run inference on multiple models.\n",
    "Each model is represented by an HEF which is built using the Hailo Dataflow Compiler.\n",
    "An HEF is Hailo's binary format for neural networks. The HEF files contain:\n",
    "\n",
    "* Target HW configuration\n",
    "* Weights\n",
    "* Metadata for HailoRT (e.g. input/output scaling)\n",
    "\n",
    "The Model Scheduler is an HailoRT component that comes to enhance and simplify the usage\n",
    "of the same Hailo device by multiple networks. The responsibility for activating/deactivating the network\n",
    "groups is now under HailoRT, and done **automatically** without user application intervention.\n",
    "In order to use the Model Scheduler, create the VDevice with scheduler enabled, configure all models to the device, and start inference on all models:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: define a callback function that will run after the inference job is done\n",
    "# The callback must have a keyword argument called \"completion_info\".\n",
    "# That argument will be passed by the framework.\n",
    "def example_callback(completion_info, bindings):\n",
    "    if completion_info.exception:\n",
    "        # handle exception\n",
    "        pass\n",
    "        \n",
    "    _ = bindings.output().get_buffer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from functools import partial\n",
    "from hailo_platform import VDevice, HailoSchedulingAlgorithm, FormatType\n",
    "\n",
    "number_of_frames = 4\n",
    "timeout_ms = 10000\n",
    "\n",
    "def infer(multi_process_service):\n",
    "    # Create a VDevice\n",
    "    params = VDevice.create_params()\n",
    "    params.scheduling_algorithm = HailoSchedulingAlgorithm.ROUND_ROBIN\n",
    "    params.group_id = \"SHARED\" \n",
    "    if multi_process_service:\n",
    "        params.multi_process_service = multi_process_service\n",
    "    \n",
    "    with VDevice(params) as vdevice:\n",
    "\n",
    "        # Create an infer model from an HEF:\n",
    "        infer_model = vdevice.create_infer_model('../hefs/resnet_v1_18.hef')\n",
    "\n",
    "        # Set optional infer model parameters\n",
    "        infer_model.set_batch_size(2)\n",
    "\n",
    "        # For a single input / output model, the input / output object \n",
    "        # can be accessed with a name parameter ...\n",
    "        infer_model.input(\"input_layer1\").set_format_type(FormatType.FLOAT32)\n",
    "        # ... or without\n",
    "        infer_model.output().set_format_type(FormatType.FLOAT32)\n",
    "\n",
    "        # Once the infer model is set, configure the infer model\n",
    "        with infer_model.configure() as configured_infer_model:\n",
    "            for _ in range(number_of_frames):\n",
    "                # Create bindings for it and set buffers\n",
    "                bindings = configured_infer_model.create_bindings()\n",
    "                bindings.input().set_buffer(np.empty(infer_model.input().shape).astype(np.float32))\n",
    "                bindings.output().set_buffer(np.empty(infer_model.output().shape).astype(np.float32))\n",
    "\n",
    "                # Wait for the async pipeline to be ready, and start an async inference job\n",
    "                configured_infer_model.wait_for_async_ready(timeout_ms=10000)\n",
    "\n",
    "                # Any callable can be passed as callback (lambda, function, functools.partial), as long\n",
    "                # as it has a keyword argument \"completion_info\"\n",
    "                job = configured_infer_model.run_async([bindings], partial(example_callback, bindings=bindings))\n",
    "\n",
    "            # Wait for the last job\n",
    "            job.wait(timeout_ms)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Running multiple models concurrently\n",
    "\n",
    "The models can be run concurrently using either multiple `Thread` objects or multiple `Process` objects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from threading import Thread\n",
    "\n",
    "pool = [\n",
    "    Thread(target=infer, args=(False,)),\n",
    "    Thread(target=infer, args=(False,))\n",
    "]\n",
    "\n",
    "print('Starting async inference on multiple models using threads')\n",
    "\n",
    "for job in pool:\n",
    "    job.start()\n",
    "for job in pool:\n",
    "    job.join()\n",
    "\n",
    "print('Done inference')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the models are run in different processes, the multi-process service must be enabled first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from multiprocessing import Process\n",
    "\n",
    "pool = [\n",
    "    Process(target=infer, args=(True,)),\n",
    "    Process(target=infer, args=(True,))\n",
    "]\n",
    "\n",
    "print('Starting async inference on multiple models using processes')\n",
    "\n",
    "for job in pool:\n",
    "    job.start()\n",
    "for job in pool:\n",
    "    job.join()\n",
    "\n",
    "print('Done inference')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
